Improving the Performance of Coordinated Checkpointers on Networks of Workstations using RAID Techniques
Abstract
Coordinated checkpointing systems are popular, general-purpose tools for implementing process migration, coarse-grained job swapping, and fault-tolerance on networks of workstations. Though simple in concept, coordinated checkpointers involve several design decisions concerning the placement of checkpoint files that can affect their performance and functionality. Although several such checkpointers have been implemented for popular programming platforms like PVM and MPI, none have taken this issue into consideration. This paper addresses the issue of checkpoint placement and its impact on the performance and functionality of coordinated checkpointing systems. Several strategies, both old and new, are described and implemented on a network of SPARC-5 workstations running PVM. These strategies range from very simple to more complex, borrowing heavily from ideas in RAID (Redundant Arrays of Inexpensive Disks) fault-tolerance. The results of this paper will serve as a guide so that future implementations of coordinated checkpointing can allow their users to achieve the combination of performance and functionality that is right for their applications.
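As a rough illustration of the kind of RAID idea the abstract alludes to, the sketch below computes an XOR parity block over a set of checkpoint images, so that any single lost image can be rebuilt from the survivors plus the parity. This is a minimal sketch under stated assumptions, not the paper's implementation: the abstract does not say which RAID schemes are used, and the function names (`pad`, `parity`, `recover`) and the sample checkpoint contents are hypothetical.

```python
from functools import reduce


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def pad(blocks):
    """Zero-pad every block to the length of the longest one."""
    size = max(len(b) for b in blocks)
    return [b.ljust(size, b"\0") for b in blocks]


def parity(blocks):
    """Compute the XOR parity block over equal-length blocks."""
    return reduce(xor_bytes, blocks)


def recover(surviving, parity_block):
    """Rebuild the single missing block from the survivors and the parity."""
    return reduce(xor_bytes, surviving, parity_block)


# Hypothetical checkpoint images from three processes (assumed data).
checkpoints = pad([b"state-of-p0", b"state-of-p1!!", b"p2"])
parity_block = parity(checkpoints)

# Simulate losing the checkpoint of process 1 and rebuilding it.
rebuilt = recover([checkpoints[0], checkpoints[2]], parity_block)
assert rebuilt == checkpoints[1]
```

Parity of this kind tolerates the loss of one checkpoint at the cost of storing one extra block, which is the basic trade-off between storage overhead and fault-tolerance that RAID-style placement involves.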
Similar resources
A Coordinated Control Scheme for Improving Voltage Quality Using Power Electronics Interfaced DGs
Recently, the growth of non-linear loads in power distribution networks has increased harmonics in these networks. The harmonic problems are worsened and complicated by the installation of power factor correction capacitors and filters. However, the interface inverters of Distributed Generations (DGs), when properly controlled, can help to improve power quality, harmonic compensation and voltage unbalance...
Improving LoRaWAN Performance Using Reservation ALOHA
LoRaWAN is one of the new and updated standards for IoT applications. However, the expected high density of peripheral devices for each gateway and the absence of an operative synchronization mechanism between the gateway and peripherals both challenge the network's scalability. In this paper, we propose to normalize the communication of LoRaWAN networks using a Reservation-ALOHA (R-A...
Memory Exclusion: Optimizing the Performance of Checkpointing Systems
Checkpointing systems are a convenient way for users to make their programs fault-tolerant by intermittently saving program state to disk, and restoring that state following a failure. The main concern with checkpointing is the overhead that it adds to the running time of the program. This paper describes memory exclusion, an important class of optimizations that reduce the overhead of checkpointin...
A MILP Model for Coordinated Charging of Electric Vehicles in Smart Unbalanced LV Distribution Networks Using Floating Charge Method
In this paper, a mixed-integer linear programming (MILP) model is proposed to solve the charging problem of electric vehicles (EVs) using the floating charge method, meaning that the EV could be supplied by any of the three phases connected to a special bus. In other words, unlike a usual household load, which is only supplied by a particular phase, in the floating charge method it is assumed that ...
Improve Replica Placement in Content Distribution Networks with Hybrid Technique
The increased use of the Internet and its accelerated growth reduce the available network bandwidth and server capacity; as a result, the quality of Internet services becomes unacceptable for users, even though the efficient and effective delivery of content on the web plays an important role in improving performance. Content distribution networks were introduced to address this issue. Replicatin...